Self-Modeling Agents and Reward Generator Corruption
Abstract
Hutter's universal artificial intelligence (AI) showed how to define future AI systems by mathematical equations. Here we adapt those equations to define a self-modeling framework, where AI systems learn models of their own calculations of future values. Hutter discussed the possibility that AI agents may maximize rewards by corrupting the source of rewards in the environment. Here we propose a way to avoid such corruption in the self-modeling framework. This paper fits in the context of my book Ethical Artificial Intelligence. A draft of the book is available at: arxiv.org/abs/1411.1373.

Self-Modeling Agents

Russell and Norvig defined a framework for AI agents interacting with an environment (Russell and Norvig 2010). Hutter adapted Solomonoff's theory of sequence prediction to this framework to produce mathematical equations that define behaviors of future AI systems (Hutter 2005).

Assume that an agent interacts with its environment in a discrete, finite series of time steps t ∈ {0, 1, 2, ..., T}. The agent sends an action a_t ∈ A to the environment and receives an observation o_t ∈ O from the environment, where A and O are finite sets. We use h = (a_1, o_1, ..., a_t, o_t) to denote an interaction history where the environment produces observation o_i in response to action a_i for 1 ≤ i ≤ t. Let H be the set of all finite histories so that h ∈ H, and define |h| = t as the length of the history h.

An agent's predictions of its observations are uncertain so the agent's environment model takes the form of a probability distribution over interaction histories:
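To make the setup above concrete, the following is a minimal Python sketch of interaction histories and a toy probability distribution over them. It is not the paper's model: the action and observation sets, the per-step observation probabilities in `step_prob`, and the uniform choice of actions in `uniform_policy_model` are all hypothetical assumptions chosen only so that the distribution can be enumerated by hand.

```python
from itertools import product
from typing import Dict, Tuple

# Hypothetical finite action set A and observation set O (not from the paper).
ACTIONS = ("left", "right")
OBSERVATIONS = (0, 1)

# A history h = (a_1, o_1, ..., a_t, o_t), stored as a tuple of (action, observation) pairs.
History = Tuple[Tuple[str, int], ...]


def length(h: History) -> int:
    """Return |h| = t, the number of action/observation steps in h."""
    return len(h)


def step_prob(action: str, obs: int) -> float:
    """Toy assumption: observation 1 is more likely after the action 'right'."""
    p_one = 0.8 if action == "right" else 0.3
    return p_one if obs == 1 else 1.0 - p_one


def uniform_policy_model(t: int) -> Dict[History, float]:
    """Enumerate a distribution over all histories of length t.

    Assumes actions are chosen uniformly at random and that each observation
    depends only on the current action (both are simplifying assumptions).
    """
    model: Dict[History, float] = {}
    for steps in product(product(ACTIONS, OBSERVATIONS), repeat=t):
        p = 1.0
        for a, o in steps:
            p *= (1.0 / len(ACTIONS)) * step_prob(a, o)
        model[steps] = p
    return model
```

For example, `uniform_policy_model(2)` enumerates every history of length 2 over these sets and assigns each one a probability, with the probabilities summing to 1. A learned environment model would replace the hand-written `step_prob` with probabilities inferred from experience.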